empirical coverage
- North America > United States > Pennsylvania (0.05)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > France (0.04)
Symbolic Quantile Regression for the Interpretable Prediction of Conditional Quantiles
Hoekstra, Cas Oude, Hengst, Floris den
Symbolic Regression (SR) is a well-established framework for generating interpretable or white-box predictive models. Although SR has been successfully applied to create interpretable estimates of the average of the outcome, it is currently not well understood how it can be used to estimate the relationship between variables at other points in the distribution of the target variable. Such estimates of e.g. the median or an extreme value provide a fuller picture of how predictive variables affect the outcome and are necessary in high-stakes, safety-critical application domains. This study introduces Symbolic Quantile Regression (SQR), an approach to predict conditional quantiles with SR. In an extensive evaluation, we find that SQR outperforms transparent models and performs comparably to a strong black-box baseline without compromising transparency. We also show how SQR can be used to explain differences in the target distribution by comparing models that predict extreme and central outcomes in an airline fuel usage case study. We conclude that SQR is suitable for predicting conditional quantiles and understanding interesting feature influences at varying quantiles.
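As a rough illustration of what "predicting a conditional quantile" means as an optimization target, the sketch below uses the pinball (quantile) loss, the standard objective in quantile regression. The abstract does not state which fitness function SQR uses, so treating the pinball loss as the objective is an assumption here, and the names `pinball_loss` and `best_constant` are hypothetical helpers. The example only checks that the constant minimizing the loss at level 0.9 lands on the empirical 0.9 quantile.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss at level tau in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

def best_constant(y, tau, grid):
    """Grid search for the constant prediction with the lowest pinball loss."""
    losses = [pinball_loss(y, c, tau) for c in grid]
    return float(grid[int(np.argmin(losses))])

# Synthetic, skewed target values (illustrative data only).
rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=2000)

grid = np.linspace(0.0, 6.0, 601)
q90_hat = best_constant(y, 0.9, grid)   # minimizer of the pinball loss
q90_emp = float(np.quantile(y, 0.9))    # empirical 0.9 quantile
print(f"pinball minimizer={q90_hat:.2f}, empirical quantile={q90_emp:.2f}")
```

In a symbolic regression setting, the constant would be replaced by a candidate expression evaluated on the features, with the same loss scoring each candidate.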
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Research Report > Experimental Study (0.67)
- Research Report > New Finding (0.46)
- Transportation > Air (1.00)
- Health & Medicine (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Data Science > Data Mining (0.66)
Distribution-Free Uncertainty-Aware Virtual Sensing via Conformalized Neural Operators
Kobayashi, Kazuma, Garg, Shailesh, Ahmed, Farid, Chakraborty, Souvik, Alam, Syed Bahauddin
Robust uncertainty quantification (UQ) remains a critical barrier to the safe deployment of deep learning in real-time virtual sensing, particularly in high-stakes domains where sparse, noisy, or non-collocated sensor data are the norm. We introduce the Conformalized Monte Carlo Operator (CMCO), a framework that transforms neural operator-based virtual sensing with calibrated, distribution-free prediction intervals. By unifying Monte Carlo dropout with split conformal prediction in a single DeepONet architecture, CMCO achieves spatially resolved uncertainty estimates without retraining, ensembling, or custom loss design. Our method addresses a longstanding challenge: how to endow operator learning with efficient and reliable UQ across heterogeneous domains. Through rigorous evaluation on three distinct applications (turbulent flow, elastoplastic deformation, and global cosmic radiation dose estimation), CMCO consistently attains near-nominal empirical coverage, even in settings with strong spatial gradients and proxy-based sensing. This breakthrough offers a general-purpose, plug-and-play UQ solution for neural operators, unlocking real-time, trustworthy inference in digital twins, sensor fusion, and safety-critical monitoring. By bridging theory and deployment with minimal computational overhead, CMCO establishes a new foundation for scalable, generalizable, and uncertainty-aware scientific machine learning.
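The split conformal step and the notion of empirical coverage can be sketched in a few lines. This is not the CMCO implementation: the DeepONet and Monte Carlo dropout are replaced by synthetic arrays `mu` and `sigma` standing in for a model's predictive mean and spread, and the scaled residual score |y - mu| / sigma is an assumed (though common) choice. The helper names `conformal_quantile` and `make_data` are hypothetical.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))  # rank of the order statistic
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return float(np.sort(scores)[k - 1])

def make_data(rng, n):
    """Synthetic stand-in: targets plus a model's mean and spread estimates."""
    x = rng.uniform(-2.0, 2.0, size=n)
    noise_scale = 0.2 + 0.3 * np.abs(x)
    y = np.sin(x) + noise_scale * rng.normal(size=n)
    mu = np.sin(x) + 0.05          # slightly biased point prediction
    sigma = 0.5 * noise_scale      # spread estimate that is too small
    return y, mu, sigma

rng = np.random.default_rng(0)
alpha = 0.1
y_cal, mu_cal, sig_cal = make_data(rng, 2000)    # held-out calibration split
y_test, mu_test, sig_test = make_data(rng, 5000)

# Nonconformity score: residual scaled by the model's own spread estimate.
scores = np.abs(y_cal - mu_cal) / sig_cal
q_hat = conformal_quantile(scores, alpha)

# Calibrated intervals and the fraction of test targets they contain.
lower = mu_test - q_hat * sig_test
upper = mu_test + q_hat * sig_test
coverage = float(np.mean((y_test >= lower) & (y_test <= upper)))
print(f"q_hat={q_hat:.3f}, empirical coverage={coverage:.3f}")
```

Because the spread estimate is deliberately too small, `q_hat` comes out well above 1: the calibration step inflates the intervals until coverage on exchangeable data is close to the nominal 1 - alpha, with no retraining of the underlying model.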
Distributionally Robust Predictive Runtime Verification under Spatio-Temporal Logic Specifications
Zhao, Yiqi, Zhu, Emily, Hoxha, Bardh, Fainekos, Georgios, Deshmukh, Jyotirmoy V., Lindemann, Lars
Cyber-physical systems (CPS) designed in simulators, often consisting of multiple interacting agents (e.g. in multi-agent formations), behave differently in the real world. We want to verify these systems during runtime when they are deployed. We thus propose robust predictive runtime verification (RPRV) algorithms for: (1) general stochastic CPS under signal temporal logic (STL) tasks, and (2) stochastic multi-agent systems (MAS) under spatio-temporal logic tasks. The RPRV problem presents the following challenges: (1) there may not be sufficient data on the behavior of the deployed CPS, (2) predictive models based on design phase system trajectories may encounter distribution shift during real-world deployment, and (3) the algorithms need to scale to the complexity of MAS and be applicable to spatio-temporal logic tasks. To address the challenges, we assume knowledge of an upper bound on the statistical distance between the trajectory distributions of the system at deployment and design time. We are motivated by our prior work [1, 2] where we proposed an accurate and an interpretable RPRV algorithm for general CPS, which we here extend to the MAS setting and spatio-temporal logic tasks. Specifically, we use a learned predictive model to estimate the system behavior at runtime and robust conformal prediction to obtain probabilistic guarantees by accounting for distribution shifts. Building on [1], we perform robust conformal prediction over the robust semantics of spatio-temporal reach and escape logic (STREL) to obtain centralized RPRV algorithms for MAS. We empirically validate our results in a drone swarm simulator, where we show the scalability of our RPRV algorithms to MAS and analyze the impact of different trajectory predictors on the verification result. To the best of our knowledge, these are the first statistically valid algorithms for MAS under distribution shift.
- North America > United States > California (0.14)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- Instructional Material > Course Syllabus & Notes (0.45)
- Research Report > New Finding (0.34)
Improving the statistical efficiency of cross-conformal prediction
Gasparin, Matteo, Ramdas, Aaditya
Conformal prediction has emerged as a general and versatile framework for constructing prediction sets in regression and classification tasks (Shafer and Vovk, 2008). Unlike traditional methods, which often depend on rigid distributional assumptions, conformal prediction transforms point predictions from any prediction (or black-box) algorithm into prediction sets that guarantee valid finite-sample marginal coverage. Originally introduced by Vovk et al. (2005), it has become increasingly influential, with numerous methods and extensions being proposed since its introduction. In particular, full conformal prediction by Vovk et al. (2005) demonstrates favorable properties regarding the coverage and the size of the prediction set. However, these advantages are counterbalanced by a substantial computational cost, which limits its practical application.
- North America > United States (0.14)
- Europe > Greece (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
Adaptive Conformal Inference by Betting
Podkopaev, Aleksandr, Xu, Darren, Lee, Kuang-Chih
Conformal prediction is a valuable tool for quantifying predictive uncertainty of machine learning models. However, its applicability relies on the assumption of data exchangeability, a condition which is often not met in real-world scenarios. In this paper, we consider the problem of adaptive conformal inference without any assumptions about the data generating process. Existing approaches for adaptive conformal inference are based on optimizing the pinball loss using variants of online gradient descent. A notable shortcoming of such approaches is in their explicit dependence on and sensitivity to the choice of the learning rates. In this paper, we propose a different approach for adaptive conformal inference that leverages parameter-free online convex optimization techniques. We prove that our method controls long-term miscoverage frequency at a nominal level and demonstrate its convincing empirical performance without any need of performing cumbersome parameter tuning.
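For context, the learning-rate-based baseline that the abstract contrasts itself with can be sketched as follows. This is the online gradient step on the pinball loss in the style of Gibbs and Candès' adaptive conformal inference, not the parameter-free betting method the paper proposes. The step size `gamma`, the `warmup` length, the function name `aci_errors`, and the synthetic score stream are all illustrative assumptions.

```python
import numpy as np

def aci_errors(scores, alpha=0.1, gamma=0.01, warmup=50):
    """Run the adaptive update alpha_t += gamma * (alpha - err_t) over a stream.

    Returns the 0/1 miscoverage indicators for every step after the warm-up.
    """
    alpha_t = alpha
    errs = []
    for t in range(warmup, len(scores)):
        past = scores[:t]
        if alpha_t <= 0.0:
            q = np.inf            # predict everything: never miscovers
        elif alpha_t >= 1.0:
            q = -np.inf           # predict nothing: always miscovers
        else:
            q = np.quantile(past, 1.0 - alpha_t)
        err = float(scores[t] > q)           # 1 if the new score is not covered
        errs.append(err)
        alpha_t += gamma * (alpha - err)     # gradient step on the pinball loss
    return np.array(errs)

# Non-exchangeable stream: the score scale triples halfway through.
rng = np.random.default_rng(0)
scale = np.where(np.arange(2000) < 1000, 1.0, 3.0)
scores = np.abs(rng.normal(size=2000)) * scale

errs = aci_errors(scores, alpha=0.1, gamma=0.01)
print(f"long-run miscoverage frequency={errs.mean():.3f}")
```

The long-run miscoverage frequency stays close to the nominal level even across the shift, but how quickly the thresholds adapt depends directly on `gamma`, which is the sensitivity the abstract identifies as a shortcoming.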
- Europe > Austria > Vienna (0.14)
- Oceania > Australia > New South Wales (0.04)
- Europe > United Kingdom > Wales (0.04)
Multivariate and Online Transfer Learning with Uncertainty Quantification
Hickey, Jimmy, Williams, Jonathan P., Reich, Brian J., Hector, Emily C.
Untreated periodontitis causes inflammation within the supporting tissue of the teeth and can ultimately lead to tooth loss. Modeling periodontal outcomes is beneficial as they are difficult and time-consuming to measure, but disparities in representation between demographic groups must be considered. There may not be enough participants to build group-specific models and it can be ineffective, and even dangerous, to apply a model to participants in an underrepresented group if demographic differences were not considered during training. We propose an extension to the RECaST Bayesian transfer learning framework. Our method jointly models multivariate outcomes, exhibiting significant improvement over the previous univariate RECaST method. Further, we introduce an online approach to model sequential data sets. Negative transfer is mitigated to ensure that the information shared from the other demographic groups does not negatively impact the modeling of the underrepresented participants. The Bayesian framework naturally provides uncertainty quantification on predictions. Especially important in medical applications, our method does not share data between domains. We demonstrate the effectiveness of our method in both predictive performance and uncertainty quantification on simulated data and on a database of dental records from the HealthPartners Institute.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Alaska (0.04)
- North America > United States > Virginia (0.04)
- Research Report (0.81)
- Instructional Material > Online (0.50)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.87)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Calibrating Bayesian Generative Machine Learning for Bayesiamplification
Bieringer, Sebastian, Diefenbacher, Sascha, Kasieczka, Gregor, Trabs, Mathias
The upcoming high-luminosity runs of the LHC will push the quantitative frontier of data taking to over 25 times its current rates. To ensure precision gains from such high statistics, this increase in experimental data needs to be met by an equal amount of simulation. The required computational power is predicted to outgrow the increase in budget in the coming years [1, 2]. One solution to this predicament is the augmentation of the expensive, Monte Carlo-based simulation chain with generative machine learning. A special focus is often put on the costly detector simulation [3, 4]. This approach is only viable under the assumption that the generated data is not statistically limited to the size of the simulated training data. Previous studies have shown, for both toy data [5] and calorimeter images [6], that samples generated with generative neural networks can surpass the training statistics due to powerful interpolation abilities of the network in data space. These studies rely on comparing a distance measure between histograms of generated data and true hold-out data to the distance between smaller, statistically limited sets of Monte Carlo data and the hold-out set. The phenomenon of a generative model surpassing the precision of its training set is also known as amplification.
- Europe > Germany > Hamburg (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Transformer Conformal Prediction for Time Series
Lee, Junghwan, Xu, Chen, Xie, Yao
Uncertainty quantification has become crucial in many scientific domains where black-box machine learning models are often used [1]. Conformal prediction has emerged as a popular and modern technique for uncertainty quantification by providing valid predictive inference for those black-box models [8, 2]. Time series prediction aims to forecast future values based on a sequence of observations sequentially ordered in time [3]. With recent advances in machine learning, numerous models have been proposed and adopted for various time series prediction tasks. The increased use of black-box machine learning models necessitates uncertainty quantification, particularly in high-stakes time series prediction tasks such as medical event prediction, stock prediction, and weather forecasting. While conformal prediction can provide valid predictive inference for uncertainty quantification, applying conformal prediction to time series is challenging since time series data often violate the exchangeability assumption.
- Oceania > Australia > New South Wales (0.05)
- North America > United States (0.05)